Technical Report CAIP TR SPEAKER AND ENVIRONMENT ADAPTATION IN CONTINUOUS SPEECH RECOGNITION

Author

  • Prabhu Raghavan
Abstract

Hidden Markov Models (HMMs) have been used with considerable success in continuous speech recognition. It is well known that high accuracy can be obtained when the HMM system is trained and tested in a quiet environment and the speech signal is acquired from a close-talking microphone. However, mismatches between the training and testing environments severely degrade performance. Two major sources of mismatch are speaker and environment variability. Speaker variation is typically caused by different speaking styles and by physiological differences between speakers, such as vocal tract length. Environment variability includes channel distortion (such as that which affects telephone speech), additive noise, and reverberation, which results when the microphone is far away from the speaker.

The goal of this report is to explore different adaptation algorithms that mitigate the effects of speaker and environmental variability on speech recognition. The adaptation algorithm examined most closely in this report is Linguistic Tree based Maximum Likelihood Linear Regression (LT-MLLR). Speech recognition experiments using LT-MLLR for speaker and environment adaptation are given. It is shown that the LT-MLLR algorithm is superior to the other adaptation algorithms discussed. For speaker adaptation, a reduction over the baseline word error rate (WER) is achieved using this algorithm. In addition, it is shown that the use of Matched Filter Array Processing (MFA) with LT-MLLR reduces the WER of distant-talking speech with high reverberation. In the case when the reverberation time is as high as [...] s, the WER is reduced from [...] to [...], a reduction of [...].

This work was supported by DARPA Contract DABT C
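The abstract reports its results as reductions in word error rate relative to a baseline. As a minimal sketch of how such figures are conventionally computed, the Python below counts word-level substitutions, deletions, and insertions with a standard edit-distance recurrence and divides by the reference length. The function names and the example sentences are illustrative assumptions, not taken from the report, and the report's own scoring tool may differ in detail.

```python
def word_errors(reference, hypothesis):
    """Return (minimum word edits, number of reference words)."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1], len(ref)


def wer(reference, hypothesis):
    """Word error rate: word edits divided by reference length."""
    errors, n_ref = word_errors(reference, hypothesis)
    if n_ref == 0:
        raise ValueError("reference transcript is empty")
    return errors / n_ref


def relative_reduction(baseline_wer, adapted_wer):
    """Fraction by which adaptation lowers the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - adapted_wer) / baseline_wer


if __name__ == "__main__":
    # Made-up transcripts, used only to exercise the functions.
    print(wer("the cat sat on the mat", "the cat sat on mat"))
    print(relative_reduction(0.20, 0.15))
```

Note that a relative reduction is a fraction of the baseline error rate, not a difference in percentage points, so the two should not be compared directly.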

Similar resources

Remes Speaker - Based Segmentation and Adaptation in Automatic Speech Recognition

With proper training, automatic speech recognition works quite well when tested in conditions similar to the training conditions, but with a new speaker or a new environment the system performance often degrades. Speaker-based adaptation alters the speech recognition system to better match a specific speaker and thus improves the speech recognition results. In order to use speaker adaptation, t...

Speaker normalization training for mixture stochastic trajectory model

In this paper we are interested in speaker and environment adaptation techniques for speaker independent (SI) continuous speech recognition. These techniques are used to reduce mismatch between training and the testing conditions, using a small amount of adaptation data. In addition to reducing this mismatch during the adaptation, we propose to reduce the variation due to speakers or environmen...

Speaker adaptation of quantized parameter HMMs

This paper extends the evaluation of Hidden Markov Models with quantized parameters (qHMM) presented in [5] to the case of speaker adaptive training. In speaker-independent speech recognition tasks, qHMMs were found to provide a similar performance as the original continuous density HMMs (CDHMM) with substantially reduced memory requirements. In this paper, we propose a Bayesian type of adaptat...

Techniques for robust speech recognition in the car environment

The use of voice commands or navigation features in the car is becoming a necessity. As keyboard and display interfaces cannot be used safely while driving, much effort has been done to make automatic speech recognition (ASR) and Text-to-Speech synthesis (TTS) ubiquitous features in the car. From voice dialing to car navigation, the requirements for voice technology vary greatly. While the use ...

Speaker Adaptation from Limited Training in the BBN BYBLOS Speech Recognition System

The BBN BYBLOS continuous speech recognition system has been used to develop a method of speaker adaptation from limited training. The key step in the method is the estimation of a probabilistic spectral mapping between a prototype speaker, for whom there exists a well-trained speaker-dependent hidden Markov model (HMM), and a target speaker for whom there is only a small amount of training spe...

Publication date: 1998